Statistical Inference IV

Chelsea Parlett-Pelleriti

Bayesian Uncertainty

Bayesian Statistics

Bayesianism’s Main Ideas:

  1. data \(X\) is fixed, and the parameters \(\theta\) of our process \(P_{\theta}\) are random

    • we imagine different parameter values that could exist
  2. inference relies on the idea of updating prior beliefs based on evidence from the data

  3. probabilities are used to quantify uncertainty we have about parameters

\[ \underbrace{p(\theta|d)}_\text{posterior} = \underbrace{\frac{p(d|\theta)}{p(d)}}_\text{update} \times \underbrace{p(\theta)}_\text{prior} \]

Bayes’ Rule

\[ P(A \mid B) = \frac{P(A\cap B)}{P(B)} = \frac{P(B \mid A) \cdot P(A)}{P(B)} \]

Bayes’ Rule is a way to calculate conditional probabilities

Bayes’ Rule

\[ P(\text{covid} \mid +) = \frac{P(+ \mid \text{covid}) \cdot P(\text{covid})}{P(+)} \]

What is the probability of having covid given that you got a positive covid test?

Bayes’ Rule

\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{P(\text{data})} \]

“How did my theory change after seeing the data?”

Bayes’ Rule

\[ \color{#D55E00}{P(\theta \mid \text{data})} = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{P(\text{data})} \]

The Posterior is the probability of \(\theta\) after seeing the data.

Bayes’ Rule

\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot \color{#CC79A7}{P(\theta)}}{P(\text{data})} \]

The Prior is the probability distribution of \(\theta\) before seeing the data.

Bayes’ Rule

\[ P(\theta \mid \text{data}) = \frac{\color{#F5C710}{P(\text{data} \mid \theta)} \cdot P(\theta)}{P(\text{data})} \]

The Likelihood is the probability of our data, given \(\theta\), evaluated for different values of \(\theta\).


Bayes’ Rule

\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{\color{#009E73}{P(\text{data})}} \]

The Normalizing Constant is the probability of our data. It normalizes the posterior so that it’s a valid probability (distribution).

It does not matter.

Bayes’ Rule

\[ P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta) \cdot P(\theta)}{\color{#009E73}{P(\text{data})}} \]

The Normalizing Constant is the probability of our data. It normalizes the posterior so that it’s a valid probability (distribution).

The normalizing constant makes \(P(\theta \mid \text{data})\) a valid probability distribution (i.e. \(\int P(\theta \mid \text{data})\, d\theta = 1\)), but it’s just a scalar constant… so \(P(\text{data} \mid \theta) \cdot P(\theta) \propto P(\theta \mid \text{data})\) 👀
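As a quick sanity check that dividing by a constant is all the normalization does: take any set of unnormalized \(\text{likelihood} \times \text{prior}\) values and divide by their total. A minimal sketch with two made-up candidate values of \(\theta\):

```python
# two hypothetical values of theta with made-up likelihood * prior values
unnorm = {0.2: 0.06, 0.8: 0.18}

# P(data) is the total probability over all the ways the data could arise
Z = sum(unnorm.values())

# dividing by Z rescales everything, so the posterior sums to 1,
# but the *relative* sizes are untouched: 0.8 stays 3x as likely as 0.2
post = {theta: v / Z for theta, v in unnorm.items()}
print(post)
```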

Bayes Rule and MCMC

\[ \left[P(\text{data} \mid \theta) \cdot P(\theta) \right] \propto P(\theta \mid \text{data}) \]

\[ \text{likelihood} \cdot \text{prior} \propto \text{posterior} \]

we have a function \(f(\theta) = P(\text{data} \mid \theta) \cdot P(\theta)\) that is proportional to a probability distribution \(p(\theta) = P(\theta \mid \text{data})\) that we want to sample from, but \(f\) itself is not a proper probability distribution…

❓ What does that remind you of?
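This is exactly the setup Metropolis-style MCMC handles: the accept/reject step only ever uses a *ratio* of target densities, so the unknown normalizing constant \(P(\text{data})\) cancels. A rough sketch of a random-walk Metropolis sampler, using the rain-in-California numbers from later in the deck (the step size, burn-in, and seed are illustrative choices, not part of the slides):

```python
import numpy as np
from scipy.stats import binom, beta

rng = np.random.default_rng(0)

# unnormalized log posterior: log likelihood + log prior (no P(data) needed!)
# numbers from the rain example: 12 rain days in 365, Beta(1, 9) prior
def log_unnorm_post(q):
    if q <= 0 or q >= 1:
        return -np.inf
    return binom.logpmf(12, 365, q) + beta.logpdf(q, 1, 9)

draws, q_cur = [], 0.5
for _ in range(20000):
    q_prop = q_cur + rng.normal(0, 0.02)   # propose a nearby value
    # accept with prob min(1, f(prop)/f(cur)) -- the constant cancels in the ratio
    if np.log(rng.uniform()) < log_unnorm_post(q_prop) - log_unnorm_post(q_cur):
        q_cur = q_prop
    draws.append(q_cur)

draws = np.array(draws[5000:])             # drop burn-in
print(draws.mean())  # close to the exact Beta(13, 362) posterior mean 13/375
```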

Bayes Rule and Parameter Estimates

Note: If we have draws from our posterior distribution \(P(\theta \mid \text{data})\), we can use them to calculate any statistic we want (mean, median, quantiles) on the draws, or on transformations of the draws.

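For instance, sketching the draws by sampling directly from \(\text{Beta}(13, 362)\) (the posterior from the rain example later in the deck; any set of posterior draws works the same way):

```python
import numpy as np

rng = np.random.default_rng(1)
q_draws = rng.beta(13, 362, size=10000)       # stand-in posterior draws

print(q_draws.mean())                          # posterior mean
print(np.median(q_draws))                      # posterior median
print(np.quantile(q_draws, [0.025, 0.975]))    # 95% credible interval
print((365 * q_draws).mean())                  # a transformation: expected rain days/year
```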

Bayes Rule by Hand

Flu Test

\[ P(\text{flu} \mid \text{+}) = \frac{P(\text{+} \mid \text{flu}) \cdot P(\text{flu})}{{P(\text{+})}} \]

  • \(P(\text{flu}) = 0.05\) (prevalence of flu)

  • \(P(\text{+} \mid \text{flu}) = 0.99\) (sensitivity of test)

  • \(P(\text{+} \mid \text{no flu}) = 0.1\) (1 − specificity of test)

  • \(P(\text{+}) = \underbrace{P(\text{+} \mid \text{flu})\cdot P(\text{flu})}_\text{way 1}+ \underbrace{P(\text{+} \mid \text{no flu})\cdot P(\text{no flu})}_\text{way 2}\)

Math
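The arithmetic, spelled out with the numbers from the bullets above:

```python
# flu test numbers from the slide
p_flu = 0.05         # prevalence: P(flu)
p_pos_flu = 0.99     # sensitivity: P(+ | flu)
p_pos_noflu = 0.10   # 1 - specificity: P(+ | no flu)

# total probability of a positive test: the two "ways" to test positive
p_pos = p_pos_flu * p_flu + p_pos_noflu * (1 - p_flu)

# Bayes' rule
p_flu_pos = p_pos_flu * p_flu / p_pos
print(round(p_pos, 4), round(p_flu_pos, 4))  # 0.1445, 0.3426
```

Even with a 99%-sensitive test, the low prevalence means a positive result only raises the probability of flu to about 34%.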

Bayes Rule by Hand

Beta-Binomial

We’re interested in estimating \(q\), the proportion of days it rains in California. It rained 12 of the last 365 days.

  • Binomial Likelihood: \(\mathcal{L}(q \mid x) = \binom{n}{x} q^{x} (1 - q)^{n - x}\)

  • Beta Prior: \(q \sim \text{Beta}(\alpha, \beta)= \frac{q^{\alpha-1} (1-q)^{\beta-1}}{B(\alpha,\beta)}\)

Beta Priors

Beta Prior Shiny App

Play around with the app for a minute, changing alpha and beta, until you find a prior that looks reasonable to you.

  • \(\alpha\): “successes” (rain days)
  • \(\beta\): “failures” (no-rain days)
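One way to build that intuition without the app: the mean of a \(\text{Beta}(\alpha, \beta)\) is \(\alpha/(\alpha+\beta)\), so scaling \(\alpha\) and \(\beta\) up together keeps the prior’s “best guess” fixed while the extra pseudo-observations shrink its spread. A quick check (the specific \(\alpha, \beta\) pairs are illustrative):

```python
from scipy.stats import beta

# same ratio of "successes" to "failures" each time, so the prior mean
# stays at 0.1 -- but more pseudo-data makes the prior more concentrated
for a, b in [(1, 9), (2, 18), (10, 90)]:
    print(a, b, beta.mean(a, b), beta.std(a, b))
```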

Bayes Rule by Hand

Beta-Binomial

We’re interested in estimating \(q\), the proportion of days it rains in California. It rained \(x = 12\) of the last 365 days.

  • Binomial Likelihood: \(\mathcal{L}(q \mid x) = \binom{365}{12} q^{12} (1 - q)^{365-12}\)

  • Beta Prior: \(q \sim \text{Beta}(1, 9) = \frac{q^{1-1} (1-q)^{9-1}}{B(1,9)}\)

Remember: \(P(q \mid x) \underbrace{\propto \binom{365}{12} q^{12} (1 - q)^{365-12}}_\text{likelihood} \times \underbrace{\frac{q^{1-1} (1-q)^{9-1}}{B(1,9)}}_\text{prior}\)

Math
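Multiplying the kernels gives \(q^{12}(1-q)^{353} \times q^{1-1}(1-q)^{9-1} = q^{13-1}(1-q)^{362-1}\) up to constants, so the posterior is \(\text{Beta}(1 + 12,\; 9 + 353) = \text{Beta}(13, 362)\). A numerical sketch confirming the conjugate update (grid size is an arbitrary choice):

```python
import numpy as np
from scipy.stats import binom, beta

q = np.linspace(0.001, 0.999, 999)   # grid over the parameter

# likelihood * prior on the grid, normalized numerically
unnorm = binom.pmf(12, 365, q) * beta.pdf(q, 1, 9)
dq = q[1] - q[0]
grid_post = unnorm / (unnorm.sum() * dq)

# conjugacy says this should match a Beta(13, 362) density exactly
exact_post = beta.pdf(q, 13, 362)
print(np.max(np.abs(grid_post - exact_post)))  # tiny: the two curves agree
```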

Bayes Rule by Hand

Beta-Binomial